Research on the innovative application of Shen Embroidery cultural heritage based on convolutional neural network

To protect intangible cultural heritage and promote outstanding cultural works, this article presents research on the recognition of Shen Embroidery using convolutional neural networks. The Shen Embroidery dataset was preprocessed and augmented to provide the data required for experimentation. Moreover, transfer learning was introduced to fine-tune the recognition network. Specifically, Spatial Pyramid Pooling (SPP) replaces the average pooling (avg pool) layer in the MobileNet V1 network, achieving the fusion of local and global features. The experimental results showed that the improved MobileNet V1 achieved a recognition accuracy of 98.45%, which was 2.3 percentage points higher than the baseline MobileNet V1 network. The experiments demonstrated that the improved convolutional neural network can efficiently recognize Shen Embroidery and provide technical support for the intelligent development of intangible cultural heritage.

www.nature.com/scientificreports/

... healthcare 13. Currently, there are various architectures and variants of CNNs that enhance deep learning performance. These variants, such as VGGNet 14, ResNet 15, and Xception 16, exhibit significant improvements in image classification accuracy and have achieved great success in many image classification tasks. In the realm of intangible cultural heritage preservation, researchers have begun integrating deep learning techniques into cultural inheritance and protection. Chen et al. 17 proposed a Cantonese opera Genre Classification Network (CoGCNet) model for classifying Cantonese opera singing types, combining a bi-layer long short-term memory network (LSTM) with a convolutional neural network (CNN) to enhance the contextual relevance between signals. This approach achieved intelligent classification management of Cantonese opera data with an accuracy of 95.69%. Zhou et al. 18 applied deep learning and transfer learning techniques to embroidery images, fine-tuning the Xception model for classification and recognition of embroidery images, addressing the problem of insufficient data collection in traditional Chinese embroidery. Wang et al. 19 introduced an innovative design method for willow pattern motifs, using ResNet to establish an image recognition model for Funan willow patterns, contributing to the sustainable development of willow craftsmanship culture. Experimental results showed that ResNet achieved the best recognition rate of 94.36% for the entire image dataset, with a recognition rate of 95.92% for modern patterns and 93.45% for traditional willow patterns. Yu et al. 20 incorporated the AlexNet model into the application research of Nantong blue calico, utilizing data augmentation techniques to expand the collected texture samples. Based on the deep learning AlexNet model, the cultural connotations of texture patterns were analyzed, and the AlexNet model achieved high accuracy in classifying patterns of Nantong blue calico with a learning rate of 0.002.

To address the limited quantity of Shen Embroidery works, the lack of datasets, and the labor-intensive process of manually selecting Shen Embroidery for classification and recognition, this study applies artificial intelligence technology to Shen Embroidery. Convolutional neural networks were used to recognize Shen Embroidery, assisting researchers in studying it and further protecting and inheriting intangible cultural heritage. The specific work in this study is as follows:
1. Enhancing and expanding the dataset through image processing techniques.
2. Experimenting with and comparing five different image classification networks, followed by analysis and discussion.
3. Fine-tuning the classification network using transfer learning and conducting analysis and discussion.
4. Replacing "avg pool" with "Spatial Pyramid Pooling" (SPP) and analyzing the improved accuracy.
The application of these steps aimed to improve the classification and recognition of Shen Embroidery using artificial intelligence, providing valuable insights for the preservation and research of intangible cultural heritage.

Dataset and experimental environment
The majority of the experimental dataset in this study was obtained from the Shen Embroidery Museum located in Nantong, Jiangsu Province, China. Additionally, a portion of the Shen Embroidery images was collected through web scraping to augment the dataset. The dataset consists of a total of 1264 Shen Embroidery images with a uniform size of 224 × 224 pixels in JPG format. Example images from the dataset are shown in Fig. 1. Before conducting the experiments, data augmentation techniques such as flipping, rotating, and color variation were applied to expand the dataset. The augmented images are shown in Fig. 2. On average, each image was augmented 15 times, resulting in a total of 18,960 augmented Shen Embroidery images. The dataset was divided into a training set and a validation set in a 9:1 ratio. The training set served as the input for network training, and the validation set served as self-checking data for the network's learning process. Furthermore, a separate test set consisting of 100 images containing Shen Embroidery was collected to evaluate the performance of the trained model.
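The dataset arithmetic and the 9:1 split described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the helper names (`hflip`, `rot90`, `adjust_brightness`, `split_train_val`), the use of tiny grayscale grids in place of real 224 × 224 JPG images, the fixed random seed, and the choice to shuffle all augmented images before splitting are assumptions made here.

```python
import random

def hflip(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def adjust_brightness(img, factor):
    """Scale pixel values and clamp to 0..255 (a simple color variation)."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def split_train_val(items, val_fraction=0.1, seed=0):
    """Shuffle and split into (train, val); seed and fraction are assumptions."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * val_fraction)
    return items[n_val:], items[:n_val]

ORIGINALS = 1264          # images reported in the text
VARIANTS_PER_IMAGE = 15   # average augmentation factor reported in the text

# Each augmented image is identified by (source image index, variant index).
augmented_ids = [(i, k) for i in range(ORIGINALS) for k in range(VARIANTS_PER_IMAGE)]
train, val = split_train_val(augmented_ids)

print(len(augmented_ids))    # 18960
print(len(train), len(val))  # 17064 1896
```

Under these assumptions the 9:1 ratio gives 17,064 training images and 1,896 validation images. Whether variants of the same source image were kept on one side of the split is not stated in the text; the sketch does not enforce it.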
The experimental platform was based on the Windows 10 operating system. Programming was done in Python under Anaconda, and the deep learning framework was PyTorch. The main hardware configuration of the computer used for the experiments includes an 8 GB GeForce GTX GPU ...

... and fully connected layer (FC). In the network, pooling operations are replaced by depthwise convolutions with a stride of 2, while the global average pooling layer is retained at the end of the network. Additionally, each convolutional layer is followed by a batch normalization (BN) layer and a ReLU activation function.
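The effect of stride-2 convolutions on the feature map size, and the parameter saving of a depthwise separable convolution, can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: it takes the network described here to be MobileNet V1 (Table 1) with five stride-2 stages, 3 × 3 kernels, and padding 1, and the 512-channel layer is only an illustrative choice. The function names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial size after one convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard convolution (bias omitted because BN follows)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """One k x k depthwise filter per input channel plus a 1 x 1 pointwise mix."""
    return kernel * kernel * c_in + c_in * c_out

# Assumption: five stride-2 stages reduce a 224 x 224 input to 7 x 7.
sizes = [224]
for _ in range(5):
    sizes.append(conv_output_size(sizes[-1]))
print(sizes)  # [224, 112, 56, 28, 14, 7]

std = standard_conv_params(3, 512, 512)
dws = depthwise_separable_params(3, 512, 512)
print(std, dws, round(dws / std, 3))  # 2359296 266752 0.113
```

The final 7 × 7 size agrees with the feature map that the SPP module receives later in the paper.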

Transfer learning
Training a convolutional neural network (CNN) requires a large dataset of images. In this study, however, the images of Shen Embroidery lacked obvious texture features such as shape and color, so high-fidelity images were required as the model dataset to better extract the key texture features of Shen Embroidery. Obtaining sufficient training data is difficult, and the cost of collecting labeled datasets is high 22. CNN models such as AlexNet, ResNet, and Xception have been trained on the large ImageNet dataset for image recognition. These models can be applied to different tasks without training from scratch. Pretrained models also aid network generalization and accelerate convergence. Model fine-tuning refers to unfreezing the top layers of the pretrained model so that the learned features become more relevant to the current task. Transferring a trained model to a new task and continuing to train it there is known as transfer learning 23. In this study, because the number of Shen Embroidery images is limited, transfer learning was used to avoid excessive training parameters and reduce the risk of overfitting. Figure 3 illustrates the concept of transfer learning.
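One practical consequence of reusing an ImageNet-pretrained network is that its 1000-class output layer has to be replaced by a layer for the two classes used in this study ("shenxiu" and "fei"). The text does not describe how the head was adapted, so the following is only a parameter-count sketch, assuming a 1024-dimensional final feature vector as in the standard MobileNet V1 and a single fully connected output layer.

```python
def fc_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

FEATURES = 1024              # assumed width of MobileNet V1's final feature vector
IMAGENET_CLASSES = 1000      # classes of the ImageNet-pretrained head
CLASSES = ("shenxiu", "fei") # the two labels used in this study

pretrained_head = fc_params(FEATURES, IMAGENET_CLASSES)
new_head = fc_params(FEATURES, len(CLASSES))
print(pretrained_head, new_head)  # 1025000 2050
```

If the pretrained layers below the head are frozen, only these 2,050 parameters would need to be learned from the Shen Embroidery images. Unfreezing the top layers for fine-tuning adds their parameters back to the trainable set.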

Spatial Pyramid Pooling (SPP)
The Spatial Pyramid Pooling (SPP) module draws inspiration from the spatial pyramid concept and enables the fusion of local and global features. Integrating local and global features enhances the expressive power of the feature maps, which helps improve accuracy when object sizes vary significantly within an image. In the recognition network constructed in this study, the SPP module was incorporated before the feature output layer. The SPP module consists of three parallel max pooling branches with kernel sizes of 1 × 1, 3 × 3, and 5 × 5, followed by a concatenation operation, as illustrated in Fig. 4. The input to the SPP module is a 7 × 7 feature map obtained through convolution, and the output is the concatenation of the three parallel branches.
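The three-branch structure can be illustrated with a small pure-Python sketch. The stride of 1 and the border handling (windows clipped at the edges, so each branch keeps the 7 × 7 size and the branches can be concatenated along the channel axis) are assumptions, since the text gives only the kernel sizes. The function names `max_pool_same` and `spp` are hypothetical.

```python
def max_pool_same(channel, k):
    """Stride-1 max pooling over a k x k window clipped at the borders."""
    h, w = len(channel), len(channel[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [channel[y][x]
                      for y in range(max(0, i - r), min(h, i + r + 1))
                      for x in range(max(0, j - r), min(w, j + r + 1))]
            row.append(max(window))
        out.append(row)
    return out

def spp(feature_map, kernel_sizes=(1, 3, 5)):
    """Pool every channel with each kernel size and concatenate along channels."""
    branches = []
    for k in kernel_sizes:
        branches.extend(max_pool_same(ch, k) for ch in feature_map)
    return branches

# A single-channel 7 x 7 feature map with values 0..48 for illustration.
feature_map = [[[i * 7 + j for j in range(7)] for i in range(7)]]
out = spp(feature_map)

print(len(out), len(out[0]), len(out[0][0]))        # 3 7 7
print(out[0][0][0], out[1][0][0], out[2][0][0])     # 0 8 16
```

The 1 × 1 branch passes the input through unchanged, while the 3 × 3 and 5 × 5 branches summarize progressively larger neighborhoods, so a C-channel input yields 3C output channels. How this output is connected to the final classifier is not covered by the sketch.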

Evaluation indicators
The evaluation metric for the model in this paper was accuracy (ACC), as shown in Eq. (1):

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

where TP represents the number of correct identifications of "shenxiu," TN represents the number of correct identifications of "fei," FN represents the number of "shenxiu" mistakenly identified as "fei," and FP represents the number of "fei" mistakenly identified as "shenxiu."
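With these four counts, accuracy is the share of correct predictions among all predictions. The sketch below uses counts read from the test described in the Results section (80 "shenxiu" and 20 "fei" images, with MobileNet V1 recognizing 98 correctly and misclassifying 2 Shen Embroidery images). Mapping those figures to TP = 78, FN = 2, TN = 20, FP = 0 is an interpretation made here, not a table from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

# Assumed reading of the MobileNet V1 result on the 100-image test set.
acc = accuracy(tp=78, tn=20, fp=0, fn=2)
print(f"{acc:.2%}")  # 98.00%
```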

Results
The training results of MobileNet V1
For the recognition of "shenxiu" images, this study used the MobileNet V1 model for experimentation. A total of 200 epochs were trained, with a checkpoint saved every 5 epochs. ... indicating that the model is approaching stability, with the loss value eventually stabilizing at around 0.1. However, the loss value still fluctuates significantly while approaching stability, as is evident from the graph.

The training results of improved MobileNet V1
For the recognition of "shenxiu" images, this study improved the MobileNet V1 model. The same dataset was fed into the modified MobileNet V1 network for training, with a total of 200 epochs and a checkpoint saved every 5 epochs. Figure 6 shows the training results of the improved MobileNet V1 model. The x-axis represents the training epochs, and the y-axis represents the loss value. The red curve represents the training loss, and the green curve represents the validation loss. As the training epochs increase, the loss curves tend to stabilize. After 100 epochs, the curves reach a steady state, indicating a well-fitted model, with the loss value eventually stabilizing at around 0.12.

... fluctuation and the best performance during training. Figure 8 shows the loss variation of the five models on the validation set. From the graph, it can be seen that MobileNet V1 also performs well on the validation set. Table 2 lists the ACC of the five models during the training process, with MobileNet V1 achieving an ACC of 96.15%, higher than the other four models, indicating high recognition accuracy. In conclusion, the MobileNet V1 model can be selected as the recognition network for Shen Embroidery in this study.
After training and fitting, the five models were used to recognize Shen Embroidery. A dataset of 100 images, including 80 images of Shen Embroidery and 20 images of non-Shen Embroidery, was used for testing. The expected recognition results are "shenxiu" for the 80 Shen Embroidery images and "fei" for the 20 non-Shen Embroidery images. The resulting confusion matrices are shown in Fig. 9, where (a) represents AlexNet, (b) VGG16, (c) MobileNet V1, (d) InceptionV3, and (e) ResNet50. From the confusion matrix, it can be observed that the MobileNet V1 model correctly recognized 98 images and misclassified 2 images of Shen Embroidery. The experimental results indicate that all five models can accurately recognize Shen Embroidery. However, MobileNet V1 was more accurate in distinguishing between "shenxiu" and non-"shenxiu" images. Therefore, the MobileNet V1 model was selected as the base model for the experiments in this study.

Comparison before and after transfer learning
Because the number of Shen Embroidery artworks is limited and the image dataset is insufficient, the recognition rate is low. This study therefore introduced transfer learning. In the training process, the pre-trained MobileNet V1 model was loaded, while the dataset, parameters, and experimental equipment were kept unchanged. The recognition results before and after transfer learning are shown in Fig. 10, where (A) represents the results without transfer learning and (B) represents the results with transfer learning. From the recognition result images, it can be observed that both models, with and without transfer learning, made an error on the first image by misclassifying the "fei" image as a "shenxiu" image. For the fourth image, however, the confidence level of identifying the "fei" image was 55.9% in (A) and 95.9% in (B). This indicates that the model after transfer learning can recognize "fei" images more accurately. The experimental results before and after transfer learning are shown in Table 3. The average recognition accuracy of the MobileNet V1 (transfer learning) model was 97.86%, which was 1.11% higher than the MobileNet V1 model. The experimental data show that the recognition model with transfer learning can identify "shenxiu" images more accurately.
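To give a feel for the difference between a 55.9% and a 95.9% confidence, the sketch below assumes that the reported confidence is a two-class softmax probability. The text does not state how confidence is computed, so this is an assumption. Under it, each confidence corresponds to a gap between the two output scores (logits), and the helper `margin_for_confidence` (a hypothetical name) recovers that gap.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def margin_for_confidence(p):
    """Logit gap (winning class minus other class) that yields confidence p."""
    return math.log(p / (1.0 - p))

for p in (0.559, 0.959):
    gap = margin_for_confidence(p)
    probs = softmax([gap, 0.0])
    print(f"confidence {probs[0]:.3f} <- logit gap {gap:.2f}")
```

Under this assumption, a confidence near 56% means the two class scores are almost tied, whereas a confidence near 96% reflects a much clearer separation between "fei" and "shenxiu".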

Conclusion
This study proposed an improved CNN-based MobileNet V1 for the recognition of Shen Embroidery in Nantong, an intangible cultural heritage. Five image classification models were tested and compared, and the MobileNet V1 model was ultimately selected as the recognition network for Shen Embroidery. Furthermore, the experimental dataset was augmented to address the challenges of limited Shen Embroidery works and low recognition rates. In the experimentation process, transfer learning was applied to the MobileNet V1 network to accelerate model fitting during training. Finally, the avg pool in the MobileNet V1 network was replaced with SPP to better extract features from Shen Embroidery. The experimental results demonstrated that the improved MobileNet V1 achieved a recognition accuracy of 98.45%, which was 2.3 percentage points higher than the original network. This validated that the improved MobileNet V1 can accurately identify Shen Embroidery. Innovative research on Shen Embroidery contributes to the application of artificial intelligence technology in the protection and inheritance of intangible cultural heritage.
incorporated the AlexNet model into the application research of Nantong blue calico, utilizing data augmentation techniques to expand the collected texture samples.Based on the deep learning AlexNet model, the cultural connotations of texture patterns were analyzed, and the AlexNet model achieved high accuracy in classifying patterns of Nantong blue calico with a learning rate of 0.002.

Figure 3. The concept of transfer learning.

Figure 4. The module of SPP.

Figure 7. Comparison of train loss of different models.

Figure 8. Comparison of test loss of different models.

Figure 11. The recognition results of the Shen embroidery.

Figure 12. The recognition results of the non-Shen embroidery.

Table 1. The network structure of MobileNet V1.

Table 2. Accuracy of different models.

Table 3. Experimental results of the transfer learning before and after.